Skip to content

Don't build prop table during json_encode(JsonSerializable) #9703

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

TysonAndre
Copy link
Contributor

@TysonAndre TysonAndre commented Oct 10, 2022

Save memory and time by not building/updating the property table.

When performing infinite recursion detection on objects, always check it on the object rather than its property table.

This reduces the additional memory usage for both internal and userland implementations of JsonSerializable caused by json_encode. It does this by avoiding creating the properties table for the first time. (Classes such as SplFixedArray both implement JsonSerializable and override get_properties)

  • EDIT: For datastructures such as SplFixedArray, the call to get_properties currently also takes time and may have side effects(destructors), because it needs to update/delete the values to match the backing array. Maybe SplFixedArray and similar datastructures should switch to get_properties_for, so that get_mangled_object_props returns only properties and not fields

This change also makes the method of infinite recursion detection consistent with the special case for standard classes from
f9f8c1c

Also mentioned in #8046 (comment) - a similar change was already made to var_export/debug_zval_dump

@cmb69
Copy link
Member

cmb69 commented Oct 10, 2022

@TysonAndre, please check the failing ext/standard/tests/class_object/get_object_vars_variation_004.phpt; maybe only the test expection needs to be updated.

@cmb69 cmb69 requested a review from bukka October 10, 2022 14:39
Save memory and time by not building/updating the property table.

When performing infinite recursion detection on objects,
always check it on the object rather than its property table.

This reduces the additional memory usage for both internal and userland
implementations of JsonSerializable caused by json_encode.
It does this by avoiding creating the properties table for the first time.
(Classes such as SplFixedArray both implement JsonSerializable and override
get_properties)

This change also makes the method of infinite recursion detection consistent
with the special case for standard classes from
f9f8c1c
@TysonAndre TysonAndre force-pushed the json_encode-JsonSerializable-optimization branch from e7adcaf to 0765e92 Compare October 10, 2022 17:29
@TysonAndre
Copy link
Contributor Author

please check the failing ext/standard/tests/class_object/get_object_vars_variation_004.phpt; maybe only the test expection needs to be updated.

That's expected - var_dump was using the gc flags on the object to detect infinite recursion, and json_encode is now using the gc flags on the object to detect infinite recursion.

The intent of get_object_vars_variation_004.phpt looks like it's to check that the right property fields are exported in get_object_vars, for properties with names that resemble protected/private properties that are not actually protected/private properties dd9ad09


I guess that this change would also mean that calling var_export($this, true) inside of jsonSerialize would now report recursion, but think most use cases wouldn't do anything along those lines, and the var_export/print_r format doesn't seem that useful to jsonSerialize

#define GC_PROTECTED                (1<<5) /* used for recursion detection */

If this or other refactorings in 8.3-dev affect major frameworks, there's the option to track infinite recursion for json_decode separately from var_export by using a different flag bit

@bukka
Copy link
Member

bukka commented Oct 23, 2022

So this is actually not a new idea (read https://bugs.php.net/bug.php?id=81524 ). The problem is obviously that you break this (which I tried with your patch and it's really broken):

class X implements JsonSerializable {
    public $prop = "value";
    public function jsonSerialize(): mixed {
        var_dump($this);
        return ['p' => $this->prop];
    }
}

echo json_encode(new X())

Basically there are for sure users that use var_dump($this) inside jsonSerialize (just for the debugging sake for example) so this would be immediately reported as a bug.

@nikic tried to handle it here #7589 but it was too slow. You could maybe try his suggestion in #7589 (comment) :

switch to a two-level recursion protection: The first user can make use of the GC_PROTECTED flag, while a nested second user would fall back to an HT instead. This would ensure that there is both little performance impact for the average case, and that there is no interference between different recursers. This would be a more intrusive change that needs to cover other users of GC_PROTECTED as well though.

Copy link
Member

@bukka bukka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted the var_dump($this) inside jsonSerialize needs to be supported.

@TysonAndre
Copy link
Contributor Author

TysonAndre commented Oct 23, 2022

@nikic tried to handle it here #7589 but it was too slow.

I hadn't seen that before. I wasn't clear on the expected behavior for var_dump - you'd have similar issues with __debugInfo and var_export, but since there are users that want this I'll avoid merging this

#define GC_PROTECTED (1<<5) /* used for recursion detection */
If this or other refactorings in 8.3-dev affect major frameworks, there's the option to track infinite recursion for json_decode separately from var_export by using a different flag bit

I see that was also mentioned in a comment there. #7589 (comment) was discouraged since using a bit flag just for json seemed excessive

That PR was using pointers as hash buckets directly, which leads to a lot of hash collisions (e.g. being aligned to 16 bytes in practice would mean all values would be hashed to the same 1 in 128 hash buckets 1 in 16 - (low bits of pointers are to bytes))

@TysonAndre TysonAndre marked this pull request as draft October 23, 2022 18:07
@bukka
Copy link
Member

bukka commented Aug 26, 2023

This was addressed by 53aa53f

@bukka bukka closed this Aug 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants